The impact of whole-genome duplications on the topology of angiosperm gene regulatory networks

Fabricio Almeida-Silva and Yves Van de Peer

VIB-UGent Center for Plant Systems Biology

Whole-genome duplications (WGD)

WGD or polyploidy: duplication of an organism’s entire set of chromosomes.

Key source of extra genetic material for evolution to work with.

WGD events have occurred in multiple taxa, e.g.:

  • vertebrates (2 WGD + 1 WGD shared by all teleosts)
  • yeasts
  • plants: multiple independent events!

Whole-genome duplications (WGD) in plants

WGD have contributed to:

  • radiation of important families
  • increased diversity
  • survival in stressful times

Polyploidy: an evolutionary dead end?

Survival and establishment of polyploids are challenging.

Detrimental effects of WGD include:

  • reduced fertility
  • genomic shock

Surviving polyploids undergo a rediploidization process that leads to genome fractionation (i.e., loss of functional DNA sequences).

Biased retention of duplicated genes

Preferential retention of genes encoding proteins involved in intricately connected systems, e.g.:

  • transcription factors (TFs)
  • kinases
  • members of multiprotein complexes

The gene balance hypothesis: preservation of stoichiometric balance explains the biased retention.

From genes to networks

Using TFs to study the impact of genome duplications.

TF activity can be explored globally in gene regulatory networks (GRNs).

Network motifs: the building blocks of complex systems

Network motifs are genetic circuits that have been positively selected.

Gene and genome duplications can create novel motifs.


What is the impact of gene and genome duplications on the topology of angiosperm GRNs?

Data overview and summary stats

Data sources:

  • Proteomes, CDS, and genome annotation: Ensembl Plants release 53
  • PPI data (physical links, confidence > 0.5): STRING
  • RNA-seq data: EBI’s Expression Atlas

Methods: Network inference


  1. Prediction of TFs: planttfhunter
    • profile HMM search using the PlantTFDB scheme
  2. GRN inference: BioNERO
    • GENIE3 algorithm

Methods: Finding and counting motifs


  1. Duplicate identification and substitution rates: doubletrouble
    • Classification as WGD- and SSD-derived genes
    • Ka, Ks, and Ka/Ks using the MYN model

  2. Motif counting and significance assessment: magrene
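
To make the idea of motif counting and significance assessment concrete, here is a minimal Python sketch. It is not the magrene implementation: the motif type (one regulator targeting both members of a duplicate pair), the function names, the toy edge list, and the target-shuffling null model are all assumptions chosen for illustration.

```python
import random
from statistics import mean, stdev

def count_shared_regulator_motifs(edges, paralog_pairs):
    """Count (regulator, gene1, gene2) triples in which one regulator
    targets both members of a duplicate pair."""
    regulators_of = {}
    for regulator, target in edges:
        regulators_of.setdefault(target, set()).add(regulator)
    return sum(
        len(regulators_of.get(g1, set()) & regulators_of.get(g2, set()))
        for g1, g2 in paralog_pairs
    )

def motif_zscore(edges, paralog_pairs, n_permutations=200, seed=1):
    """Compare the observed count with counts from degree-naive shuffles.

    Assumption: the null model simply shuffles target labels across edges;
    real analyses may use a different (e.g., degree-preserving) null.
    """
    rng = random.Random(seed)
    observed = count_shared_regulator_motifs(edges, paralog_pairs)
    regulators = [r for r, _ in edges]
    targets = [t for _, t in edges]
    null_counts = []
    for _ in range(n_permutations):
        shuffled = targets[:]
        rng.shuffle(shuffled)
        null_edges = list(zip(regulators, shuffled))
        null_counts.append(count_shared_regulator_motifs(null_edges, paralog_pairs))
    sd = stdev(null_counts)
    z = (observed - mean(null_counts)) / sd if sd > 0 else float("nan")
    return observed, z

# Hypothetical toy GRN (regulator -> target) and duplicate pairs
toy_edges = [("TF1", "A1"), ("TF1", "A2"), ("TF2", "A1"),
             ("TF2", "A2"), ("TF3", "B1")]
toy_pairs = [("A1", "A2"), ("B1", "B2")]
observed, z = motif_zscore(toy_edges, toy_pairs)
print(observed, z)
```

A Z-score well above zero would suggest the motif occurs more often than expected under the chosen null; the conclusion depends heavily on how that null is built.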


PPI networks are enriched in WGD-derived genes

Enrichment of WGD-derived genes in the PPI networks of all species (P < 0.001).
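
The slide does not state which test produced this P-value. As one possible illustration, the sketch below computes a one-sided hypergeometric (Fisher-type) enrichment P-value in Python; the function name and the gene counts are hypothetical and do not come from the study.

```python
from math import comb

def hypergeom_enrichment_pvalue(n_total, n_wgd, n_ppi, n_wgd_in_ppi):
    """P(X >= n_wgd_in_ppi) when n_ppi genes are drawn without replacement
    from n_total genes, of which n_wgd are WGD-derived."""
    upper = min(n_wgd, n_ppi)
    tail = sum(
        comb(n_wgd, k) * comb(n_total - n_wgd, n_ppi - k)
        for k in range(n_wgd_in_ppi, upper + 1)
    )
    return tail / comb(n_total, n_ppi)

# Hypothetical counts: 1000 genes, 300 WGD-derived, 100 genes in the PPI
# network, 50 of which are WGD-derived (30 would be expected by chance).
p_value = hypergeom_enrichment_pvalue(1000, 300, 100, 50)
print(p_value)
```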

WGD-derived genes in PPI networks are enriched in dosage-sensitive processes, e.g.:

  • signal transduction
  • transcriptional regulation
  • translation
  • cell wall biogenesis
  • redox homeostasis
  • lipid metabolism

Conclusion

Our findings agree with the gene balance hypothesis: WGD-derived genes are associated with protein-protein interactions.

Sequence divergence is constrained in interacting ohnologs


Conclusion

Dosage balance imposes selective pressures that constrain sequence divergence in interacting WGD-derived genes.

WGD-derived duplicates tend to interact with the same partners

Measuring interaction similarity:

\[ S(A,B) = \frac{2 \left| A \cap B \right|}{ \left|A \right| + \left| B \right|} \]

WGD-derived pairs have higher interaction similarity than SSD-derived pairs.

The difference is more pronounced for older pairs.
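
The similarity formula above is the Sørensen-Dice coefficient applied to the sets of interaction partners A and B of the two duplicates. A minimal Python sketch follows; the function name, the partner sets, and the choice to score pairs without any partners as 0 are assumptions made for illustration.

```python
def interaction_similarity(partners_a, partners_b):
    """S(A, B) = 2|A ∩ B| / (|A| + |B|) for two sets of interaction partners."""
    a, b = set(partners_a), set(partners_b)
    if not a and not b:
        return 0.0  # assumption: a pair with no partners at all scores 0
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical partner sets for one duplicate pair
gene1_partners = {"P1", "P2", "P3", "P4"}
gene2_partners = {"P2", "P3", "P5"}
similarity = interaction_similarity(gene1_partners, gene2_partners)
print(similarity)  # 2 * 2 / (4 + 3) = 4/7
```

S ranges from 0 (no shared partners) to 1 (identical partner sets).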

Conclusion

Dosage balance imposes selective pressures that prevent ohnologs from losing and gaining interactions.

(Recent) WGD fuel(ed) the emergence of network motifs

Genes from recent WGD are more frequently part of motifs than genes from ancient WGD.

WGD-derived motifs are quickly lost over time (fractionation or rewiring?)

Species with recent WGD events generally have higher motif frequencies, regardless of the duplication mode that created the genes forming motifs.

Conclusion

WGD events have a stronger impact on the short-term evolution of polyploids than on their long-term evolution.

This explains associations between WGD events and surviving environmental turmoil (e.g., the Cretaceous-Paleogene extinction and glaciation events).

WGD- and SSD-derived motifs are associated with different functions

Functional enrichment of GO terms, InterPro domains, and TF families.

WGD: growth and development, especially dosage-dependent processes, e.g.:

  • translation
  • transcriptional regulation
  • histone modifications
  • cell cycle regulation
  • carbohydrate and lipid metabolism

SSD: response to stress and environmental stimuli, e.g.:

  • oxidative stress
  • pathogenesis-related proteins
  • recognition of pathogen-associated molecular patterns
  • WRKY, ERF, and NAC TF families

Conclusion

The patterns observed for WGD- and SSD-derived motifs are very similar to what has been observed for WGD- and SSD-derived genes.

Take-home messages

Dosage balance imposes selective constraints on WGD-derived genes that lead to:

  1. Slower evolution at the sequence level (fewer substitutions per unit time)
  2. Slower evolution at the PPI level (fewer changes in partners per unit time)

WGD has a stronger impact on the short-term evolution of polyploids, but WGD-derived motifs are lost over time.

WGD contributes growth- and development-related genes to GRNs, while SSD contributes stress-related genes.

Further reading

DOI: 10.1093/molbev/msad141

Acknowledgements


Dr. Yves Van de Peer (supervision)

Ghent University and ERC (funding)

VIB Center for Plant Systems Biology (infrastructure)

Here’s where you can find me:

  • almeidasilvaf
  • https://almeidasilvaf.github.io
  • Fabricio Almeida-Silva
  • ORCID: 0000-0002-5314-2964